Multidimensional Analysis of Distributed XML Data

Author

  • Sambit Pradhan
Abstract

The rapid proliferation of the internet to ubiquity, the deep dependence of global enterprises on Web services, and the universal adoption of SOA, cloud computing, social media and online publishing have made XML the lingua franca of the digital age and have generated a plethora of XML data. The immense popularity of NoSQL and document-oriented data stores has added considerably to this trend. At the same time, the need has grown for cost-effective, low-maintenance, simple, customizable and highly scalable analytical systems for small and medium-sized businesses, information and measurement companies, and academic and research institutions. This paper presents a novel approach to dimensional analysis of distributed, disparate, heterogeneous, voluminous XML data: Multidimensional Analysis of Distributed XML data (MAX), a scalable, high-performance, open source, schema-free, document-oriented database. The primary objective of the paper is to propose an architecture for a document-oriented database, including details of its foundation data structures and querying mechanism, based on existing standard technologies, for multidimensional analysis of large sets of XML data. The motivations for this approach include simplicity of design, generality, cost-effectiveness, usability, horizontal scaling, storage efficiency, and minimal use of memory and resources. The method has virtually no memory limitation or data set size limits and performs relatively well in terms of data latency and resource consumption. The paper details an implementation of this method along with sample performance

Similar resources

XML-OLAP: A Multidimensional Analysis Framework for XML Warehouses

Recently, a large number of XML documents are available on the Internet. This trend motivated many researchers to analyze them multi-dimensionally in the same way as relational data. In this paper, we propose a new framework for multidimensional analysis of XML documents, which we call XML-OLAP. We base XML-OLAP on XML warehouses where every fact data as well as dimension data are stored as XML...

A Multidimensional Data Structure for Maintaining XML Data Partitions

To achieve good performance of processing queries on huge XML data in cluster machines, data partitioning and placement strategy is one of the key factors. In this paper we propose a multidimensional data structure for maintaining XML data partitions, specifically for holistic twig join processing. Initially, we construct the multidimensional data structure from statistical information on vario...

Meta Cube-X: An XML Metadata Foundation for Interoperability Search among Web Data Warehouses

OLAP (Online Analysis Processing) applications have very special requirements to the underlying multidimensional data that differs significantly from other areas of application (e.g. the existence of highly structured dimensions). In addition, providing access and search among multiple, heterogeneous, distributed and autonomous data warehouses, especially web warehouses, has become one of the l...

Multidimensional Analysis of XML Document Contents with OLAP Dimensions

With the emergence of Semi-structured data format (such as XML), the storage of documents in centralised facilities appeared as a natural adaptation of data warehousing technology. Nowadays, OLAP (On-Line Analytical Processing) systems face growing non-numeric data. This chapter presents a framework for the multidimensional analysis of textual data in an OLAP sense. Document structure, metadata...

Fragmenting very large XML data warehouses via K-means clustering algorithm

XML data sources are more and more gaining popularity in the context of a wide family of Business Intelligence (BI) and On-Line Analytical Processing (OLAP) applications, due to the amenities of XML in representing and managing semi-structured and complex multidimensional data. As a consequence, many XML data warehouse models have been proposed during past years in order to handle heterogeneity...

Publication date: 2015